Text analysis example

Published

January 27, 2025

Word text analysis

Some analysis has been completed on the text found at https://www.redfunnel.co.uk. The results are shown below.

Table of words used

Below is the table of words and associated counts.
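
A table like this can be built by splitting the text into words and counting each one. The sketch below is a minimal illustration in Python; the report does not say which tools produced it, and the sample sentence and the `word_counts` helper are assumptions made up for the example.

```python
import re
from collections import Counter

def word_counts(text):
    """Lowercase the text, pull out the words, and count each one."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Illustrative text only, not taken from the analysed site.
sample = "Ferry to the Isle of Wight. The Isle of Wight ferry."
counts = word_counts(sample)

print(len(counts))            # number of unique words
print(counts.most_common(3))  # the most frequent words with their counts
```

The same `Counter` gives both the number of unique words and a top-N list via `most_common`.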

Estimated similar words

Below is the table of estimated similar words used (first 3 letters are the same).
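
The "first 3 letters are the same" rule can be sketched as a simple grouping by prefix. This is a rough illustration, not necessarily how the report was produced; the `group_by_prefix` helper and the word list are hypothetical.

```python
from collections import defaultdict

def group_by_prefix(words, prefix_len=3):
    """Group words that share their first `prefix_len` letters."""
    groups = defaultdict(set)
    for word in words:
        word = word.lower()
        if len(word) >= prefix_len:
            groups[word[:prefix_len]].add(word)
    # Keep only prefixes shared by at least two different words.
    return {p: sorted(ws) for p, ws in groups.items() if len(ws) > 1}

similar = group_by_prefix(["travel", "traveller", "ferry", "ferries", "isle"])
print(similar)
```

Matching on a fixed prefix is only an estimate: unrelated words with the same opening letters are grouped together too.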

Number of characters per sentence/paragraph

Types of characters used

The number and types of characters used are indicated below.
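
One way to obtain such a breakdown is to classify every character and count the classes. The four categories in this sketch are an assumption; the report does not list the character types it uses.

```python
from collections import Counter

def char_types(text):
    """Count characters by broad type (the categories are assumed)."""
    def kind(ch):
        if ch.isalpha():
            return "letter"
        if ch.isdigit():
            return "digit"
        if ch.isspace():
            return "whitespace"
        return "punctuation/other"
    return Counter(kind(ch) for ch in text)

types = char_types("Isle 2025!")
print(types)
```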

Count of words - top 10

The top 10 words found, by number of occurrences within the text, are shown below. There are 206 unique words used.

Wordcloud

A wordcloud of the text is provided below.

Comparison wordcloud

A wordcloud of the text, grouped by sentiment, is provided below.

TF-IDF of unigram (single) words

This graph shows the tf-idf (term frequency-inverse document frequency) of single words. Words that repeat often have a lower score, so lower scores come with a higher count of words.
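
As a rough illustration of the score itself, the sketch below uses one common formulation: tf is a term's share of the tokens in a document, and idf is the natural log of the number of documents divided by the number of documents containing the term. Treating each sentence as a "document" is an assumption; the report does not say how the text was divided.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns {(doc_index, term): score}."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = {}
    for i, tokens in enumerate(docs):
        for term, n in Counter(tokens).items():
            tf = n / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            scores[(i, term)] = tf * idf
    return scores

# Two made-up "documents" for illustration.
docs = [["isle", "of", "wight", "ferry"], ["isle", "of", "wight", "travel"]]
scores = tf_idf(docs)
print(scores[(0, "isle")], scores[(0, "ferry")])
```

A term that appears in every document gets an idf of zero, which is why widely repeated words sit at the low end of the scale.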

Count of two word phrases - top 5

The top 5 two-word phrases, by the number of times they appear, are shown below. There are 387 two-word phrases.
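
Two-word phrases (bigrams) are typically formed by sliding a window of two words across the text. A minimal sketch, with a hypothetical `ngrams` helper and made-up sample text:

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a single phrase."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Illustrative text only.
tokens = "ferry to the isle of wight from the isle of wight".split()
bigram_counts = Counter(ngrams(tokens, 2))

print(bigram_counts.most_common(5))  # top 5 two-word phrases
print(len(bigram_counts))            # number of distinct two-word phrases
```

Calling `ngrams(tokens, 3)` gives three-word phrases in the same way.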

Count of unique phrases by type

It may not look like it, but there are 1094 datapoints on this graph. Of these:
Unigrams: 265 (24% of the total)
Bigrams: 423 (39% of the total)
Trigrams: 406 (37% of the total)
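
The percentages above follow directly from the three counts. A quick check of the arithmetic, using only the figures quoted in this section:

```python
counts = {"Unigram": 265, "Bigram": 423, "Trigram": 406}

total = sum(counts.values())
# Share of each phrase type, rounded to the nearest whole percent.
shares = {name: round(100 * n / total) for name, n in counts.items()}

print(total)
print(shares)
```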

Wordcloud - two word phrases

A wordcloud of the two-word phrases in the text is provided below.

Comparison two-phrase wordcloud

A wordcloud of the two-word phrases, grouped by sentiment, is provided below.

TF-IDF of bigrams (two-word phrases)

This graph shows the tf-idf (term frequency-inverse document frequency) of two-word phrases. Phrases that repeat often have a lower score, so lower scores come with a higher count of phrases.

Network graph of bigrams that occur more than once

This network graph shows the links between the words of two-word phrases that occur more than once in the text.
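
The data behind such a graph is an edge list: each repeated two-word phrase becomes a link from its first word to its second. The sketch below builds only that edge list and leaves out the drawing; the `bigram_edges` helper and the sample text are assumptions.

```python
from collections import Counter

def bigram_edges(tokens, min_count=2):
    """Return (first_word, second_word, count) for bigrams seen at least min_count times."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return [(a, b, n) for (a, b), n in pairs.items() if n >= min_count]

# Illustrative text only.
tokens = "ferry to the isle of wight from the isle of wight".split()
edges = bigram_edges(tokens)
print(edges)
```

Each tuple can then be passed to a graph library as a weighted, directed edge.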

Sentiment of text

The sentiment types of the text are shown below. There are 3 sentiment types: Neutral, Positive and Negative.
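
A three-way label of this kind can be produced by a simple lexicon lookup. The sketch below is only an illustration of the idea: the tiny word lists are invented for the example, and the report does not state which sentiment method or lexicon it used.

```python
import re

# Hypothetical mini-lexicon, made up for this example.
POSITIVE = {"great", "easy", "enjoy"}
NEGATIVE = {"delay", "cancelled", "poor"}

def classify(sentence):
    """Label a sentence Positive, Negative, or Neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

for s in ["A great and easy crossing", "The sailing was cancelled", "Ferries to the Isle of Wight"]:
    print(s, "->", classify(s))
```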

Below is the table of text and associated sentiment.

Topic modelling

The estimated topics from the text are shown below, with the 10 highest-scoring words for each topic on the graph. 4 topics were chosen.

Top 3 sentences

The top 3 scoring (summary extraction) sentences are below.
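
Extractive summarisation scores every sentence and keeps the best ones. The report does not name its scoring method, so the sketch below swaps in a simple stand-in: each sentence is scored by the average frequency of its words across the whole text.

```python
import re
from collections import Counter

def top_sentences(text, k=3):
    """Return the k sentences whose words are, on average, most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return sorted(sentences, key=score, reverse=True)[:k]

# Illustrative text only.
text = "Ferry to the isle. The isle ferry is fast. Book now."
print(top_sentences(text, k=1))
```

Averaging rather than summing avoids favouring long sentences purely for their length.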

Word associations for top 2 words

Word 1 - isle

Word 1 is isle, and the top 10 words associated with it are shown below.

Word 2 - wight

Word 2 is wight, and the top 10 words associated with it are shown below.
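
Word association can be measured in several ways, and the report does not say which it used. As a simple stand-in, the sketch below counts how often other words appear in the same sentence as the target word; the `associated_words` helper and the sample sentences are hypothetical.

```python
from collections import Counter

def associated_words(sentences, target, top_n=10):
    """Count words that share a sentence with `target` (sentences are token lists)."""
    co_occurrence = Counter()
    for tokens in sentences:
        if target in tokens:
            co_occurrence.update(set(tokens) - {target})
    return co_occurrence.most_common(top_n)

# Illustrative sentences only.
sentences = [["isle", "of", "wight"], ["isle", "ferry"], ["wight", "ferry"]]
assoc = associated_words(sentences, "isle")
print(assoc)
```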

Get value and sentiment of words preceded by these top 2 words

Word 1 - isle

Word 2 - wight

Get value and sentiment of words followed by these top 2 words

Word 1 - isle

Word 2 - wight
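
Finding the words that come directly after or directly before a target word is a matter of looking one position to either side of each match. The sketch below covers only that lookup (the sentiment scoring of the neighbouring words is not shown); the `neighbours` helper and the sample text are assumptions.

```python
from collections import Counter

def neighbours(tokens, target, offset):
    """offset=+1 counts words that follow `target`; offset=-1 counts words that precede it."""
    found = Counter()
    for i, token in enumerate(tokens):
        j = i + offset
        if token == target and 0 <= j < len(tokens):
            found[tokens[j]] += 1
    return found

# Illustrative text only.
tokens = "ferry to the isle of wight from the isle of wight".split()
after_isle = neighbours(tokens, "isle", +1)
before_isle = neighbours(tokens, "isle", -1)
print(after_isle, before_isle)
```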